Machine learning of probabilistic phonological pronunciation rules from the Italian CLIPS corpus
Authors
Abstract
A blend of phonological concepts and technical analysis is proposed to improve the modeling and understanding of phonological processes. Based on the manual segmentation and labeling of the Italian CLIPS corpus, we automatically derive a probabilistic set of phonological pronunciation rules: a new alignment technique maps the phonological form of spontaneous sentences onto the phonetic surface form, and a machine-learning algorithm then calculates a set of phonological replacement rules together with their conditional probabilities. A critical analysis of the resulting probabilistic rule set is presented and discussed with regard to regional Italian accents. The rule set is also applied in the newly published web service WebMAUS, which allows a user to segment and phonetically label Italian speech via a simple web interface.
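The two steps named in the abstract (aligning canonical and surface forms, then estimating conditional probabilities for replacement rules) can be illustrated with a minimal sketch. It is not the authors' method: plain Levenshtein alignment stands in for the paper's new alignment technique, which the abstract does not describe, and the rule context is assumed to be one canonical neighbour on each side. The names `align`, `rule_probabilities`, and `GAP`, as well as the toy transcriptions, are hypothetical and not taken from CLIPS.

```python
from collections import Counter

GAP = "-"  # assumed symbol for a deleted or inserted segment


def align(canonical, surface):
    """Align two phone sequences with plain Levenshtein distance.

    Returns a list of (canonical_phone, surface_phone) pairs; GAP marks
    a deletion (surface side) or an insertion (canonical side).
    """
    n, m = len(canonical), len(surface)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (canonical[i - 1] != surface[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)

    # Trace back from the bottom-right cell to recover one optimal alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and cost[i][j] ==
                cost[i - 1][j - 1] + (canonical[i - 1] != surface[j - 1])):
            pairs.append((canonical[i - 1], surface[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((canonical[i - 1], GAP))
            i -= 1
        else:
            pairs.append((GAP, surface[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


def rule_probabilities(corpus):
    """Estimate P(surface | left, phone, right) by relative frequency.

    `corpus` is an iterable of (canonical, surface) phone lists.
    Insertions are ignored to keep the sketch short.
    """
    context_counts = Counter()
    rule_counts = Counter()
    for canonical, surface in corpus:
        # Keep one aligned pair per canonical phone, in order.
        pairs = [p for p in align(canonical, surface) if p[0] != GAP]
        padded = ["#"] + [c for c, _ in pairs] + ["#"]  # "#" = boundary
        for k, (c, s) in enumerate(pairs):
            context = (padded[k], c, padded[k + 2])
            context_counts[context] += 1
            rule_counts[(context, s)] += 1
    return {rule: count / context_counts[rule[0]]
            for rule, count in rule_counts.items()}


# Invented toy data: intervocalic /z/ realised as [s] in two of three tokens.
toy_corpus = [
    (["k", "a", "z", "a"], ["k", "a", "s", "a"]),
    (["k", "a", "z", "a"], ["k", "a", "s", "a"]),
    (["k", "a", "z", "a"], ["k", "a", "z", "a"]),
]
toy_rules = rule_probabilities(toy_corpus)
```

In this sketch each rule is keyed as `((left, phone, right), surface)`, so `toy_rules[(("a", "z", "a"), "s")]` is the relative frequency with which /z/ between two /a/ surfaces as [s] in the toy data. A wider context window or smoothing of rare rules would be natural extensions, but neither is claimed for the original work.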
Similar resources
A Transformation-Based Learning Method on Generating Korean Standard Pronunciation
In this paper, we propose a Transformation-Based Learning (TBL) method on generating the Korean standard pronunciation. Previous studies on the phonological processing have been focused on the phonological rule applications and the finite state automata (Johnson 1984; Kaplan and Kay 1994; Koskenniemi 1983; Bird 1995). In case of Korean computational phonology, some former researches have approa...
Independent automatic segmentation by self-learning categorial pronunciation rules
The goal of this paper is to present a new method to automatically generate pronunciation rules for automatic segmentation of speech: the German MAUSER system. MAUSER is an algorithm which generates pronunciation rules independently of any domain dependent training data either by clustering and statistically weighting self-learned rules according to a small set of phonological rules clustered by...
An interactive English pronunciation dictionary for Korean learners
We present research towards developing a pronunciation dictionary that features sensitivity to learners’ native phonology, specifically designed for Korean learners of English-as-a-Foreign-Language (EFL). We envision a future system that can record and process learners’ imitation of the dictionary pronunciation and instantly provide segmental and prosodic feedback on accent. Towards this goal, ...
Automatic derivation of phonological rules for mispronunciation detection in a computer-assisted pronunciation training system
Computer-Assisted Pronunciation Training System (CAPT) has become an important learning aid in second language (L2) learning. Our approach to CAPT is based on the use of phonological rules to capture language transfer effects that may cause mispronunciations. This paper presents an approach for automatic derivation of phonological rules from L2 speech. The rules are used to generate an extended...
Building multiple pronunciation models for novel words using exploratory computational phonology
In this paper we describe a completely automatic algorithm that builds multiple pronunciation word models by expanding baseform pronunciations with a set of candidate phonological rules. We show how to train the probabilities of these phonological rules, and how to use these probabilities to assign pronunciation probabilities to words not seen in the training corpus. The algorithm we propose is...
Publication date: 2013